|
In-place matrix transposition, also called in-situ matrix transposition, is the problem of transposing an ''N''×''M'' matrix in-place in computer memory, ideally with ''O''(1) (bounded) additional storage, or at most with additional storage much less than ''NM''. Typically, the matrix is assumed to be stored in row-major order or column-major order (i.e., contiguous rows or columns, respectively, arranged consecutively). Performing an in-place transpose (in-situ transpose) is most difficult when ''N'' ≠ ''M'', i.e. for a non-square (rectangular) matrix, where it involves a complicated permutation of the data elements, with many cycles of length greater than 2. In contrast, for a square matrix (''N'' = ''M''), all of the cycles are of length 1 or 2, and the transpose can be achieved by a simple loop to swap the upper triangle of the matrix with the lower triangle. Further complications arise if one wishes to maximize memory locality in order to improve cache line utilization or to operate out-of-core (where the matrix does not fit into main memory), since transposes inherently involve non-consecutive memory accesses. The problem of non-square in-place transposition has been studied since at least the late 1950s, and several algorithms are known, including several which attempt to optimize locality for cache, out-of-core, or similar memory-related contexts. ==Background== On a computer, one can often avoid explicitly transposing a matrix in memory by simply accessing the same data in a different order. For example, software libraries for linear algebra, such as BLAS, typically provide options to specify that certain matrices are to be interpreted in transposed order to avoid data movement. However, there remain a number of circumstances in which it is necessary or desirable to physically reorder a matrix in memory to its transposed ordering. For example, with a matrix stored in row-major order, the rows of the matrix are contiguous in memory and the columns are discontiguous. If repeated operations need to be performed on the columns, for example in a fast Fourier transform algorithm (e.g. Frigo & Johnson, 2005), transposing the matrix in memory (to make the columns contiguous) may improve performance by increasing memory locality. Since these situations normally coincide with the case of very large matrices (which exceed the cache size), performing the transposition in-place with minimal additional storage becomes desirable. Also, as a purely mathematical problem, in-place transposition involves a number of interesting number theory puzzles that have been worked out over the course of several decades. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「In-place matrix transposition」の詳細全文を読む スポンサード リンク
|